Fine-Grained Hidden Markov Modeling for Broadcast-News Story Segmentation
Abstract
We present the design and development of a Hidden Markov Model for dividing news broadcasts into story segments. We discuss the model topology and the textual features used, together with the non-parametric techniques employed to estimate both transition and observation probabilities. Visualization methods developed for analyzing system performance are also presented.
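As a rough illustration of how an HMM can mark story boundaries, the sketch below runs Viterbi decoding over a toy two-state model. Everything in it is an assumption made for illustration: the states (0 = first sentence of a story, 1 = inside a story), the two observation symbols (0 = boundary cue seen, 1 = no cue), and all probability values are hypothetical. It does not reproduce the topology, the textual features, or the non-parametric estimates described in the abstract.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state sequence for a discrete-observation HMM."""
    n_states = len(start_p)

    def log(p):
        # Work in log space; a zero probability becomes -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # score[s] = best log-probability of any path that ends in state s.
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1

    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def story_boundaries(path, start_state=0):
    """Positions where the decoded path enters the story-start state."""
    return [t for t, s in enumerate(path) if s == start_state]


# Hypothetical toy parameters (not taken from the paper).
start_p = [1.0, 0.0]                 # a broadcast always opens a story
trans_p = [[0.0, 1.0],               # start -> inside
           [0.2, 0.8]]               # inside -> new story, or stay inside
emit_p = [[0.9, 0.1],                # story starts usually show the cue
          [0.1, 0.9]]                # story interiors usually do not

obs = [0, 1, 1, 0, 1]                # one symbol per sentence
path = viterbi(obs, start_p, trans_p, emit_p)
boundaries = story_boundaries(path)  # [0, 3] for this toy input
```

In this toy setting the decoded path is `[0, 1, 1, 0, 1]`, so stories are taken to begin at sentences 0 and 3. A finer-grained model would typically use more states per story and richer observations than a single binary cue.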
Similar Resources
Text segmentation and topic tracking on broadcast news via a hidden Markov model approach
Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. In this paper we describe a general methodology based on Hidden Markov Models and classical language modeling techniques for automatically inferring story boundaries (segmentation) and for retrievi...
Segmentation and Indexation of Broadcast News
This paper describes a topic segmentation and indexation system for broadcast news that is integrated in an alert system for selective dissemination of multimedia information. The goal of this work is to enhance the retrieval and navigation through specific spoken audio segments that have been automatically transcribed, using speech recognition. Our segmentation algorithm is based on simple heu...
Indexing Broadcast News
This paper describes a topic segmentation and indexation system for broadcast news that is integrated in an alert system for selective dissemination of multimedia information. The goal of this work is to enhance the retrieval and navigation through specific spoken audio segments (stories) that have been automatically transcribed, using speech recognition. Our segmentation algorithm is based on ...
Segmenting Broadcast News Streams using Lexical Chains
In this paper we propose a coarse-grained NLP approach to text segmentation based on the analysis of lexical cohesion within text. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast, our segmentation task requires the discovery of topical units of text, i.e., distinct news stories from broadcast news programmes. Our sy...
A hidden Markov model approach to text segmentation and event tracking
Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. For these techniques to be easily applicable, it is highly desirable that the transcripts be segmented into stories. This paper introduces a general methodology based on HMMs and on classical langu...